Crate object_store
object_store
This crate provides a uniform API for interacting with object storage services and
local files via the ObjectStore trait.
Create an ObjectStore implementation:
- Google Cloud Storage: GoogleCloudStorageBuilder
- Amazon S3: AmazonS3Builder
- Azure Blob Storage: MicrosoftAzureBuilder
- HTTP Storage: HttpBuilder
- In Memory: InMemory
- Local filesystem: LocalFileSystem
Adapters
ObjectStore instances can be composed with various adapters that add functionality:
- Rate Throttling: ThrottleConfig
- Concurrent Request Limit: LimitStore
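The wrapping pattern behind such adapters can be sketched without the crate itself. The block below is a minimal, synchronous illustration using only the standard library; the `Store` trait, `MemStore`, and `PrefixedStore` are hypothetical stand-ins invented for this sketch and are not the crate's types or signatures. It shows only the idea that an adapter owns an inner store, implements the same trait, and adjusts each call before delegating.

```rust
use std::collections::BTreeMap;

/// Hypothetical, synchronous stand-in for a store trait (not the crate's API).
pub trait Store {
    fn put(&mut self, path: &str, data: Vec<u8>);
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// Minimal in-memory backend used as the innermost store.
#[derive(Default)]
pub struct MemStore {
    objects: BTreeMap<String, Vec<u8>>,
}

impl Store for MemStore {
    fn put(&mut self, path: &str, data: Vec<u8>) {
        self.objects.insert(path.to_string(), data);
    }

    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Adapter that prepends a constant prefix to every path before delegating.
pub struct PrefixedStore<S: Store> {
    inner: S,
    prefix: String,
}

impl<S: Store> PrefixedStore<S> {
    pub fn new(inner: S, prefix: &str) -> Self {
        Self { inner, prefix: prefix.to_string() }
    }

    /// Unwrap the adapter and return the store it was wrapping.
    pub fn into_inner(self) -> S {
        self.inner
    }

    fn full_path(&self, path: &str) -> String {
        format!("{}/{}", self.prefix, path)
    }
}

// Because the adapter implements the same trait as the store it wraps,
// adapters can be nested or swapped without changing calling code.
impl<S: Store> Store for PrefixedStore<S> {
    fn put(&mut self, path: &str, data: Vec<u8>) {
        let full = self.full_path(path);
        self.inner.put(&full, data);
    }

    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.inner.get(&self.full_path(path))
    }
}
```

In this sketch, writing `file1` through `PrefixedStore::new(MemStore::default(), "tenant-a")` stores the object under `tenant-a/file1` in the inner store, while callers keep using the short path.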
List objects
Use the ObjectStore::list method to iterate over objects in
remote storage or files in the local filesystem:
use std::sync::Arc;
use object_store::{path::Path, ObjectStore};
use futures::stream::StreamExt;
// create an ObjectStore
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
// Recursively list all files below the 'data' path.
// 1. On AWS S3 this would be the 'data/' prefix
// 2. On a local filesystem, this would be the 'data' directory
let prefix: Path = "data".try_into().unwrap();
// Get an `async` stream of Metadata objects:
let list_stream = object_store
    .list(Some(&prefix))
    .await
    .expect("Error listing files");
// Print a line about each object based on its metadata
// using for_each from `StreamExt` trait.
list_stream
    .for_each(move |meta| {
        async {
            let meta = meta.expect("Error listing");
            println!("Name: {}, size: {}", meta.location, meta.size);
        }
    })
    .await;
This prints something like the following:
Name: data/file01.parquet, size: 112832
Name: data/file02.parquet, size: 143119
Name: data/child/file03.parquet, size: 100
...
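To make "recursively list all files below the 'data' path" concrete, here is a small standard-library model of prefix filtering over plain strings. It assumes that a prefix is compared one whole path segment at a time, so `data` would not match `database/...`; that assumption, and the helper names `is_below` and `list_recursive`, are illustrative only and not part of the crate.

```rust
/// Returns true if `location` lies strictly below `prefix`, comparing whole
/// `/`-separated segments (an assumption of this sketch).
pub fn is_below(location: &str, prefix: &str) -> bool {
    let mut segments = location.split('/').filter(|s| !s.is_empty());
    // Every prefix segment must equal the next location segment.
    let all_match = prefix
        .split('/')
        .filter(|s| !s.is_empty())
        .all(|p| segments.next() == Some(p));
    // "Below" means at least one more segment remains after the prefix.
    all_match && segments.next().is_some()
}

/// Keep only the locations below `prefix`, at any depth.
pub fn list_recursive<'a>(locations: &[&'a str], prefix: &str) -> Vec<&'a str> {
    locations
        .iter()
        .copied()
        .filter(|location| is_below(location, prefix))
        .collect()
}
```

Applied to the three paths printed above plus `database/other.parquet`, `list_recursive(&paths, "data")` keeps the three `data/...` entries, including the one in the `child` subdirectory, and drops the `database` entry.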
Fetch objects
Use the ObjectStore::get method to fetch the data bytes
from remote storage or files in the local filesystem as a stream.
use std::sync::Arc;
use object_store::{path::Path, ObjectStore};
use futures::stream::StreamExt;
// create an ObjectStore
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
// Retrieve a specific file
let path: Path = "data/file01.parquet".try_into().unwrap();
// fetch the bytes from object store
let stream = object_store
    .get(&path)
    .await
    .unwrap()
    .into_stream();
// Count the '0's using `map` from `StreamExt` trait
let num_zeros = stream
    .map(|bytes| {
        let bytes = bytes.unwrap();
        bytes.iter().filter(|b| **b == 0).count()
    })
    .collect::<Vec<usize>>()
    .await
    .into_iter()
    .sum::<usize>();
println!("Num zeros in {} is {}", path, num_zeros);
This prints something like the following:
Num zeros in data/file01.parquet is 657
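The example above collects one count per chunk into a `Vec` and then sums it. The same total can be kept as a running value with a fold. The sketch below models the stream as an ordinary iterator of byte chunks so that it needs only the standard library; `count_zeros` is a hypothetical helper, and the `Vec<u8>` chunks stand in for the byte buffers a real stream would yield.

```rust
/// Count zero bytes across a sequence of chunks while holding only a
/// running total, instead of collecting per-chunk counts first.
pub fn count_zeros<I>(chunks: I) -> usize
where
    I: IntoIterator<Item = Vec<u8>>,
{
    chunks.into_iter().fold(0, |total, chunk| {
        // Add this chunk's zero count to the running total.
        total + chunk.iter().filter(|b| **b == 0).count()
    })
}
```

The result does not depend on where chunk boundaries fall, which matters because a store is free to split an object into chunks of any size.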
Put object
Use the ObjectStore::put method to save data to remote storage or the local filesystem.
use object_store::ObjectStore;
use std::sync::Arc;
use bytes::Bytes;
use object_store::path::Path;
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
let path: Path = "data/file1".try_into().unwrap();
let bytes = Bytes::from_static(b"hello");
object_store
    .put(&path, bytes)
    .await
    .unwrap();
Multipart put object
Use the ObjectStore::put_multipart method to save large amounts of data in chunks.
use object_store::ObjectStore;
use std::sync::Arc;
use bytes::Bytes;
use tokio::io::AsyncWriteExt;
use object_store::path::Path;
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
let path: Path = "data/large_file".try_into().unwrap();
let (_id, mut writer) = object_store
    .put_multipart(&path)
    .await
    .unwrap();
let bytes = Bytes::from_static(b"hello");
writer.write_all(&bytes).await.unwrap();
writer.flush().await.unwrap();
writer.shutdown().await.unwrap();
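The example above writes a single small buffer. For a genuinely large payload, the data would typically be handed to the writer one part at a time. The sketch below shows that loop with the standard library's blocking `std::io::Write` standing in for the asynchronous writer; `write_in_parts` and its `part_size` parameter are hypothetical names chosen for illustration, and any part size limits a particular storage service enforces are outside its scope.

```rust
use std::io::{self, Write};

/// Write `data` to `writer` in parts of at most `part_size` bytes, then flush.
/// Returns the number of parts written.
pub fn write_in_parts<W: Write>(
    writer: &mut W,
    data: &[u8],
    part_size: usize,
) -> io::Result<usize> {
    assert!(part_size > 0, "part_size must be non-zero");
    let mut parts = 0;
    // `chunks` yields consecutive slices; the last one may be shorter.
    for part in data.chunks(part_size) {
        writer.write_all(part)?;
        parts += 1;
    }
    writer.flush()?;
    Ok(parts)
}
```

With an asynchronous writer the loop has the same shape: each `write_all` call is awaited, and the writer is flushed and shut down once the last part has been written.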
Modules
- An object store implementation for S3
- An object store implementation for Azure blob storage
- A ChunkedStore that can be used to test streaming behaviour
- Utility for streaming newline delimited files from object storage
- An object store implementation for Google Cloud Storage
- An object store implementation for generic HTTP servers
- An object store that limits the maximum concurrency of the wrapped implementation
- An object store implementation for a local filesystem
- An in-memory object store implementation
- Path abstraction for Object Storage
- An object store wrapper handling a constant path prefix
- A throttling object store wrapper
Structs
- Exponential backoff with jitter
- HTTP client configuration for remote object stores
- Options for a get request, such as range
- Result of a list call that includes objects, prefixes (directories) and a token for the next set of results. Individual result sets may be limited to 1,000 objects based on the underlying object storage’s limitations.
- The metadata that describes an object.
- Contains the configuration for how to respond to server errors
Enums
- A specialized Error for object store-related errors
- Result for a get request
Traits
- Provides credentials for use when signing requests
- Universal API to multiple object store services.
Functions
- Create an ObjectStore based on the provided url
- Create an ObjectStore based on the provided url and options
Type Definitions
- An alias for a dynamically dispatched object store implementation.
- Id type for multi-part uploads.
- A specialized Result for object store-related errors